Journals
  Publication Years
  Keywords
Search within results Open Search
Please wait a minute...
For Selected: Toggle Thumbnails
Text-to-image synthesis method based on multi-level progressive resolution generative adversarial networks
XU Yining, HE Xiaohai, ZHANG Jin, QING Linbo
Journal of Computer Applications    2020, 40 (12): 3612-3617.   DOI: 10.11772/j.issn.1001-9081.2020040575
Abstract347)      PDF (1238KB)(352)       Save
To address the problem that the results of text-to-image synthesis tasks have wrong target structures and unclear image textures, a Multi-level Progressive Resolution Generative Adversarial Network (MPRGAN) model was proposed based on Attentional Generative Adversarial Network (AttnGAN). Firstly, a semantic separation-fusion generation module was used in low-resolution layer, and the text feature was separated into three feature vectors by the guidance of self-attention mechanism and the feature vectors were used to generate feature maps respectively. Then, the feature maps were fused into low-resolution map, and the mask images were used as semantic constraints to improve the stability of the low-resolution generator. Finally, the progressive resolution residual structure was adopted in high-resolution layers. At the same time, the word attention mechanism and pixel shuffle were combined to further improve the quality of the generated images. Experimental results showed that, the Inception Score (IS) of the proposed model reaches 4.70 and 3.53 respectively on datasets of Caltech-UCSD Birds-200-2011 (CUB-200-2011) and 102 category flower dataset (Oxford-102), which are 7.80% and 3.82% higher than those of AttnGAN, respectively. The MPRGAN model can solve the instability problem of structure generation to a certain extent, and the images generated by the proposed model is closer to the real images.
Reference | Related Articles | Metrics